ABSTRACT


The dead time in a Naval Training Pipeline is defined as time spent by enrolled
students doing things other than training. There are eight major categories of dead time,
and their effect has been to decrease the utilization of personnel to under 70% in recent
times. Twenty-four courses for four years (1996-1999) have been selected for study. The
Academic Setbacks for the course with CDP identifier 6400 have been chosen for initial work
and model building. The methods developed for this case will be applied to the others to
the extent possible. The exploratory analyses will seek to discover internal patterns of
setbacks. Failing this, the process will be declared as time homogeneous and in a steady
state.
I. INTRODUCTION


A. PROBLEM PRESENTATION 


The deadtime in a Naval Training Pipeline refers to situations in which students
are enrolled for training but not undergoing training. There are a number of reasons, e.g.,
waiting for a seat in a class, waiting for a transfer to the next training command or to the
fleet, waiting for discharge from Naval service, or having to temporarily come out of a
class once it has started. The deadtime issue has attracted more attention under the
current Navy environment, in which cost cutting and manpower downsizing are
emphasized, especially since its effect has decreased the utilization of personnel to under
70% in recent times. Reducing deadtime is beneficial for both the Navy, which pays for
it, and the sailor, who endures it.


B. RESEARCH BACKGROUND AND PRIMARY QUESTION 


Two studies are cited for background. Belcher (1999) considers student not-under 
instruction time and reveals the impact and contribution of eight major categories of dead 
time. Belcher’s document analyzes causes and recommends methods to decrease the time 
awaiting instruction (AI), awaiting training (AT) and instruction interruption (II). In 
another study, Rhoades (1998) suggests information systems for integrating the Navy’s 
recruiting, training, and assignment in order to optimize the entire system. 

These studies identified deadtime and its cost to the Navy. Another
important issue is that of identifying those time periods during a course of instruction that
experience the beginning of deadtime. This thesis develops a method to identify these
deadtime bottlenecks.


C. THESIS OUTLINE 


The next chapter, Methodology, introduces the foundation of Poisson regression.
The third chapter describes the given data. Chapter IV, Model Fitting, fits the Poisson
regression model to the selected data and calculates the deviance as a measure of goodness
of fit. Chapter V computes, interprets, and analyzes the output, and
reveals the usefulness of the output as well. Chapter VI concludes the work with some
recommendations.


D. EXPECTED BENEFITS OF THIS THESIS 


Using a Poisson regression analysis, we intend to locate the worst deadtime
bottleneck in a particular course. To simplify the analysis, we consider the Academic
Setbacks of course 6400 for model building and exploratory data analysis. The analysis
will reveal any deadtime bottlenecks that should be identified and considered for possible
administrative action. If there are no bottlenecks, we will declare the process as time
homogeneous and in a steady state, requiring no adjustment. The methodology used in
this study can be extended to other Navy courses to identify significant deadtime
categories.

















II. METHODOLOGY


A. INTRODUCTION 


Poisson regression analysis is appropriate for response variables that have
non-negative integer values: 0, 1, 2, .... The Poisson distribution is used to describe the
response; the behavior of the mean value function in various categories is the goal of
modeling.

The occurrences of deadtime events of the type in our study are relatively rare.
Let us examine the Academic Setback of the course 6400. The number of student
academic setbacks must be one of the non-negative integer values 0, 1, 2, .... One student
academic setback is assumed to be independent of any other student academic setback.
The total number of academic setbacks for a single course is not large, but there are
many courses, and the overall problem becomes large.


B. POISSON DISTRIBUTION 


The Poisson distribution has a single parameter, called lambda, λ, which is the
average or expected number of events per unit of time, i.e., the mean μ. Interestingly, the
variance of the Poisson distribution is also equal to λ. The values possibly taken by the
Poisson random variable are the non-negative integers.


The mathematical expression of the Poisson distribution for obtaining y events,
given that λ events are expected, is

$$P(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!} \tag{2.1}$$

where Y = the Poisson random variable.
P = the probability of y events given a knowledge of λ.
λ = expected number of counts, i.e., the mean μ.
e = the base of the natural logarithm (approximated by 2.71828).
y = user supplied input.
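
As a small illustration of the Poisson probability formula, the following Python sketch evaluates P(Y = y) for a given λ. The function name `poisson_pmf` and the rate 2.5 used in the check are illustrative choices, not part of the thesis programs.

```python
import math

def poisson_pmf(y, lam):
    """Probability of observing y events when lam events are expected."""
    if y < 0:
        return 0.0
    if lam == 0:
        # With a zero rate, the only possible outcome is zero events.
        return 1.0 if y == 0 else 0.0
    # Work on the log scale to avoid overflow in lam**y and y!.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

# Illustrative check: the probabilities over all counts should sum to one.
total_probability = sum(poisson_pmf(y, 2.5) for y in range(60))
```

Working on the log scale is a design choice for numerical safety; for the small counts in this study a direct evaluation would give the same values.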


C. THE METHOD OF MAXIMUM LIKELIHOOD

The method of maximum likelihood for the estimation of statistical parameters is
the one used in this thesis. This method selects the value of λ, based upon the data,
which maximizes the likelihood function of the observed results. A likelihood function
takes positive values. Often it is easier to work with the log-likelihood function than the
likelihood function itself. Since the logarithmic function is a monotonically increasing
function, the estimator that maximizes the log-likelihood function will maximize the
likelihood function as well. Because each Poisson probability is at most one, this
log-likelihood function takes non-positive values.
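
The idea can be checked numerically. The sketch below assumes a small set of hypothetical daily counts (not taken from the thesis data) and compares the Poisson log-likelihood at the sample mean, which is the closed-form maximum likelihood estimate for a common rate, with the log-likelihood at a few nearby rates.

```python
import math

def poisson_log_likelihood(rate, counts):
    """Log-likelihood of independent Poisson counts sharing one rate (rate > 0)."""
    return sum(-rate + y * math.log(rate) - math.lgamma(y + 1) for y in counts)

# Hypothetical daily setback counts, for illustration only.
counts = [0, 1, 0, 2, 1, 0, 0, 3]

# Closed-form maximizer of the Poisson log-likelihood: the sample mean.
mle = sum(counts) / len(counts)
best_log_lik = poisson_log_likelihood(mle, counts)

# Log-likelihood at some other candidate rates, for comparison.
nearby_rates = [mle * factor for factor in (0.5, 0.9, 1.1, 2.0)]
nearby_log_liks = [poisson_log_likelihood(r, counts) for r in nearby_rates]
```

Each value in `nearby_log_liks` should be no larger than `best_log_lik`, and all of them are negative, in line with the remark about the sign of the log-likelihood.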


D. THE POISSON REGRESSION PROCEDURE 


The Poisson regression procedure hypothesizes a model to explain the observed 
data. The maximum likelihood method is used to estimate the parameters of the model. 


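The most general case of a Poisson regression (the saturated model) defines an
individual λ for each data point in a sample of size N:

$$L(\mathbf{y}; \boldsymbol{\lambda}) = \prod_{i=1}^{N} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!} \tag{2.2}$$

A Poisson regression model would define some relationship among the λ_i:

$$L(\mathbf{y}; \hat{\boldsymbol{\lambda}}) = \prod_{i=1}^{N} \frac{e^{-\hat{\lambda}_i} \hat{\lambda}_i^{y_i}}{y_i!} \tag{2.3}$$

and λ_i is estimated by y_i in the saturated model.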


E. THE MEASURES OF GOODNESS OF FIT 


One measure of how closely the Poisson regression model fits the observed data is
called the deviance $D(\hat{\lambda})$ for the regression model (Kleinbaum, page 503):

$$D(\hat{\boldsymbol{\lambda}}) = -2 \ln\left[\frac{L(\mathbf{y}; \hat{\boldsymbol{\lambda}})}{L(\mathbf{y}; \boldsymbol{\lambda})}\right] \tag{2.4}$$

where $L(\mathbf{y}; \hat{\boldsymbol{\lambda}})$ is the estimated likelihood of the proposed model,
and $L(\mathbf{y}; \boldsymbol{\lambda})$ is that of the saturated model.

The better the Poisson regression model fits the observed data, the closer
$L(\mathbf{y}; \hat{\boldsymbol{\lambda}}) / L(\mathbf{y}; \boldsymbol{\lambda})$ gets to one. Since both the numerator and denominator are evaluated at maximum
likelihood estimates, the $D(\hat{\lambda})$ statistic is approximately a chi-square variate with N-K
degrees of freedom. K is the number of parameters in the model.
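
A minimal Python sketch of the deviance follows. It assumes the saturated model estimates each λ_i by y_i, so the log-likelihood ratio reduces to a sum of terms y ln(y/λ̂) - (y - λ̂), with the convention that 0 ln 0 = 0. The function name and the sample counts are hypothetical.

```python
import math

def poisson_deviance(counts, fitted_rates):
    """Deviance of a Poisson model against the saturated model (lambda_i = y_i)."""
    total = 0.0
    for y, lam in zip(counts, fitted_rates):
        if y > 0:
            if lam <= 0:
                # A positive count is impossible under a zero rate.
                return math.inf
            total += y * math.log(y / lam) - (y - lam)
        else:
            # y = 0: the y*ln(y/lam) term vanishes, leaving only lam.
            total += lam
    return 2.0 * total

# Hypothetical counts, fitted with one common rate (their mean).
counts = [0, 1, 0, 2, 1, 0, 0, 3]
common_rate = sum(counts) / len(counts)
deviance_common = poisson_deviance(counts, [common_rate] * len(counts))

# The saturated model reproduces the data exactly, so its deviance is zero.
deviance_saturated = poisson_deviance(counts, counts)
```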
III. STATISTICAL DESCRIPTION OF DATA


The given data include 24 courses from 1996 to 1999. Our first task is to explore
the selected data sets for their statistical properties. Table 3.1 reveals the structure of the
raw data for a single course and a single category of dead time; the raw data consist of eight
columns.


Table 3.1. The Abstract of Raw Data of Academic Setback from 1996 to 1999.


These eight columns contain: the fiscal year (ENROLLFY), the course
identification number (CIN), the category of deadtime (CATEGORY), the course
number (CDP), the course length (CRSELEN), the day number into the course (DAY) of
the event, the number of students entering deadtime on that day (COUNTS), and the
deadtime reason (ABBRNM). For simplification of the initial work and model building,
we selected the Academic Setback in CDP 6400. Therefore the CATEGORY, the CDP
and the CRSELEN are AS, 6400 and 164 days, respectively, and are fixed in this example. We focus
on the DAY and the COUNTS of the events.

Table 3.2 organizes the information into how many events happened for each day
into the course. For instance, four students received academic setbacks on the 14th day of
the course, one student on the 99th day, but no students on the 19th day, etc. If there was
no student setback on a day, such as the 19th, the original data set did not include a record
for DAY = 19. Thus day 19 does not appear. We did find some students having a
setback after 164 days. These are viewed as mis-entries and are ignored.

Table 3.3 records the frequency of the various COUNTS. It records the number
of days for each category of COUNTS. For example, out of 164 days, there were no
setbacks declared on 70 of the days and there was exactly one on 35 of the days, etc. The
variable COUNTS in Table 3.3 takes on eight values: zero, one, two, three, four, five, six
and seven. The total proportion of counts for one, two and three is 49.38%, while the
total for four, five, six and seven is 7.93%.
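
The construction of the two frequency tables can be sketched in Python as follows. The records below are hypothetical (DAY, COUNTS) pairs, not the thesis data; days with no record are filled with zero, and records beyond the 164-day course length are dropped as mis-entries, as described above.

```python
from collections import Counter

COURSE_LENGTH = 164  # CRSELEN for CDP 6400

def counts_by_day(records, course_length=COURSE_LENGTH):
    """Return a list of daily counts, one entry per course day (day 1 first)."""
    daily = [0] * course_length
    for day, count in records:
        # Records past the course length are treated as mis-entries and ignored.
        if 1 <= day <= course_length:
            daily[day - 1] += count
    return daily

def frequency_of_counts(daily):
    """Number of days on which each value of COUNTS was observed."""
    return dict(sorted(Counter(daily).items()))

# Hypothetical raw records: (DAY, COUNTS).
records = [(14, 4), (99, 1), (170, 2)]
daily = counts_by_day(records)
freq = frequency_of_counts(daily)
```

Here `daily` plays the role of Table 3.2 with the zero days made explicit, and `freq` plays the role of Table 3.3.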














Table 3.2. Frequency Table by DAY. 
Table 3.3. Frequency Table by COUNTS.


Figure 3.1¹ graphically illustrates the distribution of the setbacks. To obtain the
general trend in the data, we group seven-day sets into weeks (ten days in the last period).
Since there was not enough data for 1999, it was pooled with 1998 data for display.
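
The weekly grouping can be sketched as below. The function name is hypothetical; it assumes the input holds one count per course day and that leftover days are folded into the final period, which for a 164-day course gives 22 seven-day weeks followed by one ten-day period.

```python
def weekly_totals(daily, days_per_week=7):
    """Sum daily counts into weeks; leftover days are folded into the last period."""
    n_periods = len(daily) // days_per_week
    totals = []
    for w in range(n_periods):
        start = w * days_per_week
        # The final period absorbs any remaining days (ten days for N = 164).
        end = len(daily) if w == n_periods - 1 else start + days_per_week
        totals.append(sum(daily[start:end]))
    return totals

# Illustration with one setback on every day of a 164-day course.
example_totals = weekly_totals([1] * 164)
```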

The peaks suggest time bottlenecks marking the student setbacks. In both of the
years 1996 and 1997, the peak value happened in the sixth week, but in the year 1998+99
it did not. Rather than having a common peak location for each year, the peaks move
back and forth. At this point, we do not know whether the cyclical effect is real, or
merely an artifact of randomness. The graphs may be misleading, because these peaks
move when we change the size of the grouping.


Figure 3.1. The Distribution of AS-6400 in COUNTS per Week from 1996 to 1999.


¹ The data in year 1999 included only five observations; therefore we combine them with the year 1998 and
mark the result as 1998+99.














IV. MODEL FITTING 


A. MODEL DESCRIPTION 


We describe the observed setbacks using DAY as the sole explanatory variable.
The sequence of course days is divided into K intervals. Within an interval, the λ is the
same for each day's Poisson distribution. Between intervals, the λs can be different.

For example, suppose K=3 and the total number of days in a course is N=150. The
first interval could contain the first 60 days. The second interval could include the next
40 days, and the last piece includes the remaining 50 days. A different partition might
use intervals of lengths 34, 57, and 59.

The parameter λ is the expected value of the response variable in a Poisson
process. It is the rate of the counts on a day and can be represented as λ = λ(days) by a
mean value function, that is, a function of the explanatory variables. Since the λs are the
same for each day in an interval, the maximum likelihood estimator of the interval's λ (the
Academic Setback rate) is equal to the sum of the setbacks divided by the number of days
in the interval.

To illustrate the idea of the explanatory variables DAY and interval, consider
Table 4.1 below, Figures 3.1 and 4.1, and the output of program read.gam, which we will
discuss later. In Table 4.1, we sum the counts to get a response from DAY 14 to 125,
which is around the 3rd to the 18th week in Figure 3.1 as well as the 2nd interval in
Figure 4.1. Then we divide the responses by the number of days in the interval
to get the average rate.


Table 4.1. The COUNTS Corresponding to the DAY, Interval and Week.


Figure 4.1. The Five-Interval Policy of AS-6400 in 1996-1999.

Note that the rate in Figure 3.1 varies from week to week but in Figure 4.1 it is a
constant within the chosen intervals. A week, being a fixed interval that slices the course
length mechanically, disguises the trend of the curve. Consequently, we model λ(days)
as a simple step function and choose breakpoints by maximum likelihood. This will
follow the trend with variable length intervals, so that the λs are constant over
well-selected intervals of days.


B. EXAMPLE 





The course with CDP number 6400 is used as a tangible example to illustrate the
method. The course has N = 164 days. Let Y be a Poisson random response for a
particular day. Thus Y is the number of setbacks (the variable COUNTS) observed that
day. The N days are partitioned into K intervals. It is convenient to describe the partition
by a set of breakpoints b_j, j = 1, ..., K. The breakpoints are the indices of the last day in each
interval, where the days are numbered consecutively from day 1 to day N.

The observed values of Y from Table 3.1 are y, = 0, y,, = 3, and such. The course 
is N=164 days long. Let K=5 for a five-interval partition. Pick breakpoints at 13, 125,
132, 133, and 164.
Under these conditions we can calculate the number of days in each interval:
p_1=13, p_2=125-13=112, p_3=132-125=7, p_4=133-132=1, and p_5=164-133=31.

To calculate the maximum likelihood estimator $\hat{\lambda}_j$ for an interval, sum the
observed number of setbacks and divide by the number of days in the interval. For
example, for interval 5, the sum of occurrences is 1+1+1+2+1+1+1=8. The maximum
likelihood estimator of λ_5 is 8/31=0.258. Note that for both intervals 1 and 3, the λ
estimate is 0.
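
The arithmetic of this example can be reproduced with a few lines of Python. The variable names are illustrative; the breakpoints and the nonzero counts of interval 5 are the ones quoted in the text, and all other days of that interval are taken to have zero setbacks.

```python
# Breakpoints of the five-interval example (index of the last day in each interval).
breakpoints = [13, 125, 132, 133, 164]

# Interval lengths are differences of consecutive breakpoints, starting from day 0.
starts = [0] + breakpoints[:-1]
lengths = [end - start for start, end in zip(starts, breakpoints)]

# Nonzero daily counts observed in interval 5, as listed in the text.
interval5_counts = [1, 1, 1, 2, 1, 1, 1]

# Maximum likelihood estimate of the rate: total setbacks divided by days in the interval.
rate5 = sum(interval5_counts) / lengths[4]
```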


C. METHOD 


Let Y_1, ..., Y_N be N independent Poisson random variables, one for each day in the
course. The days in the course are partitioned into K intervals and all of the Poisson
variables associated with an interval have a common parameter.

The partition can be described in two ways. Both are given because some of the 
equations are greatly simplified by using one notation or the other. 


Take p_1, ..., p_K, where p_j is the number of days in interval j. The sum of p_1
through p_K is N.

We take b_1, ..., b_K as the breakpoints of each interval. The b_j is the index of the last
element in interval j. The index starts from day 1. Note that b_1 = p_1, b_2 = p_1 + p_2, and so
on. The last breakpoint, b_K, equals N. To simplify notation later on, we define b_0 = 0.


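The likelihood function of the saturated model is

$$L(\mathbf{y}; \boldsymbol{\lambda}) = \prod_{i=1}^{N} \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!} \tag{4.1}$$

The log-likelihood function of the saturated model is

$$\ln L(\mathbf{y}; \boldsymbol{\lambda}) = \sum_{i=1}^{N} \left( -\lambda_i + y_i \ln \lambda_i - \ln y_i! \right) \tag{4.2}$$
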
For the Poisson regression model we have K intervals. Within an interval the
Poisson distribution for each day uses the same λ. Let λ' be the K sets of λs of the
regression model; then the likelihood function of the regression model is

$$L(\mathbf{y}; \boldsymbol{\lambda}') = \prod_{j=1}^{K} \prod_{i=b_{j-1}+1}^{b_j} \frac{e^{-\lambda'_j} (\lambda'_j)^{y_i}}{y_i!} \tag{4.3}$$

The log-likelihood of the regression model has the complex form

$$\ln L(\mathbf{y}; \boldsymbol{\lambda}') = -\sum_{j=1}^{K} p_j \lambda'_j + \sum_{j=1}^{K} \ln \lambda'_j \sum_{i=b_{j-1}+1}^{b_j} y_i - \sum_{i=1}^{N} \ln y_i! \tag{4.4}$$

We want to select the partition, p_1, ..., p_K, which maximizes the log-likelihood of the
regression model. Identifying the best partition is computer intensive and is accomplished
using a program implementing the network shortest path algorithm. This was
accomplished with the help of Dennis Mar of the Systems Management Department and
Professor Lawphonpanich of the Operations Research Department. An outline of the
method is presented in Appendixes A, B, and C.
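
The search for the best partition can also be sketched as a dynamic program over breakpoints, which is equivalent to a shortest path with exactly K arcs. The Python below is an illustrative sketch, not the GAMS network formulation of the appendixes; the function names and the sample counts are hypothetical. As in the appendix programs, the cost of an interval is its negative log-likelihood at the interval's maximum likelihood rate, omitting the ln(y!) terms, which do not depend on the partition.

```python
import math

def interval_cost(prefix, i, j):
    """Negative log-likelihood (without ln y! terms) of days i+1..j at their MLE rate."""
    days = j - i
    total = prefix[j] - prefix[i]
    if total == 0:
        return 0.0  # zero rate: the interval contributes nothing
    rate = total / days
    return days * rate - total * math.log(rate)

def best_partition(counts, k):
    """Return (minimum cost, breakpoints) for splitting the days into exactly k intervals."""
    n = len(counts)
    prefix = [0]
    for y in counts:
        prefix.append(prefix[-1] + y)
    best = [[math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):          # number of intervals used so far
        for j in range(m, n + 1):      # last day covered
            for i in range(m - 1, j):  # last day of the previous interval
                if best[m - 1][i] == math.inf:
                    continue
                cand = best[m - 1][i] + interval_cost(prefix, i, j)
                if cand < best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i
    # Walk the back-pointers to recover the breakpoints b_1..b_k.
    breakpoints, j = [], n
    for m in range(k, 0, -1):
        breakpoints.append(j)
        j = back[m][j]
    breakpoints.reverse()
    return best[k][n], breakpoints

# Hypothetical 15-day course with setbacks concentrated on days 6 to 9.
sample_counts = [0] * 5 + [3, 4, 2, 3] + [0] * 6
sample_cost, sample_breaks = best_partition(sample_counts, 3)
```

The triple loop costs on the order of K N² evaluations, which is small for a 164-day course. Requiring exactly K intervals is a simplification made here; the appendix model instead bounds the number of intervals from above.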

The measures of goodness of fit of Poisson regression models are obtained from 
the comparison of maximized likelihood values. We use the deviance to produce 
likelihood ratio tests for assessing the goodness of fit. 


The log-likelihood ratio statistic has the form:

$$D(\boldsymbol{\lambda}') = -2 \ln\left[\frac{L(\mathbf{y}; \boldsymbol{\lambda}')}{L(\mathbf{y}; \boldsymbol{\lambda})}\right] \tag{4.5}$$

where D(λ') is the deviance for the regression model.

If the model is a valid one, then the D(λ') statistic has approximately a chi-square
distribution with N-K degrees of freedom. As the log-likelihood of the regression model
increases, the deviance statistic decreases. If the deviance is large, the chi-square
test rejects the null hypothesis that the step function model is tenable. This is
evidence that the regression model does not fit.
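
The upper-tail probability used in this test can be approximated without a statistics library. The sketch below uses the Wilson-Hilferty normal approximation to the chi-square distribution, which is a substitute chosen here for illustration and not the routine used by the SAS program; the function name is hypothetical. It is applied to the five-interval deviance of 155.832 with 164 - 5 = 159 degrees of freedom reported in Chapter V.

```python
import math

def chi2_upper_tail_wh(x, df):
    """Approximate P(X >= x) for a chi-square variate with df degrees of freedom.

    Uses the Wilson-Hilferty cube-root normal approximation.
    """
    if x <= 0:
        return 1.0
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Five-interval policy: deviance 155.832 with N - K = 164 - 5 degrees of freedom.
alpha_five = chi2_upper_tail_wh(155.832, 164 - 5)
```

Under these assumptions `alpha_five` comes out close to 0.556, well above the 0.05 threshold, so the five-interval step function would not be rejected.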
V. COMPUTATION AND ANALYSIS


A. COMPUTATION 


This chapter traces the methodology of fitting the models to the data, and judging 
goodness of fit. The steps are to maximize the log likelihood (or minimize the negative of 
the log likelihood as that quantity is more directly related to the deviance statistic), and 
use the Chi-square distribution to judge goodness of fit. Three programs are utilized: two 
programs read2.gam and read.gam were created by Professor Lawphonpanich and a 
SAS program was developed by Dennis Mar. The guidance for using these programs is 
explained in the Appendix E. 

The read2.gam program (Appendix A) calculates the negative of the maximum log likelihood for
an interval policy with any specified number of intervals K, e.g., a three-interval policy, a
five-interval policy, etc. Its output informs the choice of the starting number of intervals for the policy.

The read.gam program (Appendix B) finds the best choice of contiguous intervals, such as
(13, 112, 7, 1, 31) for a five-interval policy of AS in course 6400 from fiscal year 1996
to 1999. These are the interval lengths; the breakpoints are DAY 13, 125, 132, 133, and
164. This is the best choice for a five-interval policy.

The SAS program includes two procedures: the Data Step and the Deviance & Alpha (α)
Value Step. The data step selects and organizes data from the raw data and prepares
formatted data for the gam programs. Then the deviance & alpha value step computes the
deviance and calculates the upper-tail probability of the chi-square distribution at that
deviance.
Tables 5.1, 5.2, 5.3, and 5.4, consolidating the outputs of AS course 6400 in 1998
from those three programs, are used to interpret the output, illustrate goodness of fit, and
analyze the output.


B. INTERPRETATION OF OUTPUT 


Table 5.1 is the output of read2.gam. The values in the first column signify
the number of intervals in the policy: three to 10. The values in the second column are the
negatives of the maximum log likelihood values. In the way that read2.gam calculates,
smaller is better. Comparing the differences in value between the three- and four-interval (2.59),
four- and five-interval (8.64), and five- and six-interval (2.74) policies, the pair four and five has the
biggest change. This biggest marginal value among these three pairs suggests the five-interval
policy is plausibly a reasonable first choice for the next step.


Table 5.1. The Negative Maximum Log Likelihood of Various Interval Policies
for AS6400_98.

Intervals    Value
3            84.8867
4            (illegible)
5            73.6616
6            (illegible)
7            67.9204
8            66.0098
9            62.5889
10           59.9182
Table 5.2 includes the output of program Read.gam and the SAS Deviance & Alpha
Value Step.

The Read.gam output shows the negative of the maximum log likelihood,
breakpoints, indicators and average rate of the partition. The negative of the maximum
log likelihood value is 73.6616, which is the same as in Table 5.1 for the five-interval policy.
The first column, noted as [S.D14], [D14.D16], gives the beginning and the ending day of each
interval. S is the start day of the course, D14 is the 14th day of the course, etc.
The 1st interval is from S to D14, the 2nd is from D14 to D16, etc. The breakpoints are
(b_1, b_2, b_3, b_4, b_5) = (14, 16, 77, 105, 164) and the lengths of intervals are (p_1, p_2, p_3, p_4, p_5)
= (14, 2, 61, 28, 59).


Table 5.2. The Best Five Intervals, Deviance and α Value of Chi-Square Test.


Five Partitions Policy
VARIABLE TOTCOST.L = 73.6616 negative log likelihood
PARAMETER output

               X         λ
S   .D14     1.0000
D14 .D16     1.0000    2.5000
D16 .D77     1.0000    0.1475
D77 .D105    1.0000    0.7857
D105.D164    1.0000    0.1017


The SAS System Output of 6400 AS 98 for 5 Partition Deviance & Alpha Value
Obs    dev        Alpha
1      155.832    0.55624

The X column is a binary variable, which indicates whether an interval was
selected for the final model: 1 is for selected and 0 is not. The zeros do not appear in the
output. In our formulation of the problem only the included intervals are shown. The X
value is always 1; therefore we can ignore it.
The λ column is the average rate (λ) for the interval. A missing value in the λ
column implies 0. The average Academic Setback rate of partition [D14.D16] is 2.5
counts for each day during this period of time.

In the SAS output of deviance and alpha value, the deviance is 155.831 and the α
value is 0.55624 with 159 degrees of freedom (=164 - 5, course length minus number of
intervals). The α value is discussed further in the Goodness of Fit section.


C. GOODNESS OF FIT
The null hypothesis is H_0: the K-partition model fits the observed data. Under H_0, the
distribution of the deviance statistic is approximately chi-square. Let α be the probability that the
deviance random variable is greater than or equal to the realized deviance statistic. At the
5% level of significance, calculated values of α greater than 0.05 support the null
hypothesis. Consider the five-, three- and seven-interval policies first (Tables 5.2, 5.3 and 5.4).
The alpha (α) value indicates the probability that we would observe a deviance
value of that size or larger when the null hypothesis is true. This alpha level is 0.06381
for the best three-interval policy, 0.55624 for the best five-interval policy, and 0.82357
for the best seven-interval policy.


Table 5.3. The Best Three Intervals, Deviance and α Value of Chi-Square Test.


AS6400 98
Three Partitions Policy
VARIABLE TOTCOST.L = 84.8867 negative log likelihood
PARAMETER output

               X         λ
S   .D77     1.0000    0.1818
D77 .D105    1.0000    0.7857
D105.D164    1.0000    0.1017


The SAS System output
6400 AS 98 for 3 Partition Deviance & Alpha Value
Obs    dev        Alpha
1      187.005    0.06381


Table 5.4. The Best Seven Intervals, Deviance and α Value of Chi-Square Test.


Seven Partition Policy
VARIABLE TOTCOST.L = 67.9204 negative log likelihood
PARAMETER output

               X         λ
S   .D14     1.0000
D14 .D16     1.0000    2.5000
D16 .D32     1.0000
D32 .D33     1.0000    2.0000
D33 .D77     1.0000    0.1591
D77 .D105    1.0000    0.7857
D105.D164    1.0000    0.1017


The SAS System Output
6400 AS 98 for 7 Partition Deviance & Alpha Value
Obs    dev        Alpha
1      140.482    0.82357


The results lead to a basic dilemma. How many intervals are suitable for the
analysis and requisite recommendations? This is a trade-off between the number of
intervals K and the goodness-of-fit statistic. The three-interval policy is desirable
because of its simplicity. But while its deviance value would not be rejected at the 0.05
level, it is close. The practitioner could reasonably select between the five- and
seven-interval policies. For the remainder of this thesis, the seven-interval policy will be
studied.


D. ANALYSIS OF OUTPUT 


We construct Figure 5.1 from the output of the seven-interval policy for analysis
due to its higher confidence level. Referring to the figure, the 2nd, the 4th and the 6th
intervals attract our attention more than others. This seven-interval policy follows a
low-high pattern. The rates of setback are 0.00, 2.50, 0.00, 2.00, 0.16, 0.79, and 0.10. The 1st
and the 3rd intervals, consisting of 14 days and 16 days, were near the beginning of the
course, where low setback rates are expected. The 2nd and the 4th, consisting of two and
one days, may reflect the learning problems from previous intervals which were not dealt
with until those particular days. Looking beyond the first four intervals, the last three
show up in the three-interval style that we prefer.


Figure 5.1. The Seven-Interval Policy for AS of Course 6400 in 1998+99
(estimated rate and daily count plotted against DAY/interval).
Following the low-high pattern, the last three intervals exhibit a specified style:
increasing interval, high setback rate interval, decreasing interval. The 5th interval
included 44 days, the 6th interval 28 days and the 7th interval 59 days. The 6th interval











E. ADVANTAGES OF OUTPUT 


Comparing the daily count to the interval rate in Figure 5.1, the interval rate
simplifies the curve and our study. Taking advantage of this, we try to use the seven-interval
policy as our general policy. Its use is to identify common periods of time having a
commonality of concerns in the course.

Figure 5.2 consolidates the seven-interval policy over the four years. The four-year
sum curve accumulates all contributions from the four years of data and displays
some stable rates except for the 3rd interval (day 39 to 40) and the 6th interval (day 133).
The rates in the 1st interval (0.00), the 5th (0.00) and the 7th (0.25) are approximately equal,
as are the rates in the 2nd (1.75) and the 4th (1.6353).


Figure 5.2. The Consolidated Common Policy of AS-6400
(seven-interval common policy for AS of course 6400 from 1996 to 1999, plotted against Day/Interval).

The consolidated figure gives us the advantage of knowing the common behavior of
the intervals when we compare them from year to year. Clearly the 3rd interval in the
four-year sum curve is more significant than the 5th interval. At nearly the same
period of time, from day 32 to day 45, the rate rises promptly. This stage includes the 2nd
interval for year 1997 with a stable rate of 1.2581 over the long term of day 13-88, the 2nd interval
for year 1996 with a rate of 1.7143 during day 35-42, and the 4th interval with a rate of 2
on day 32 for year 1998+99. Therefore, days 32 to 45 of the course will have the
first priority for administrative attention.
VI. CONCLUSIONS


To illustrate further how the developed method works, we test it on Academic
Attrition and Instruction Interruption (excluding Holiday Leave) of the course 6400.
Table 6.1 displays the result.


Table 6.1. The Test Output of AA and II in Course 6400.


Read.gam output for AA6400 mix
82.0508 negative log likelihood

               X         λ
S   .D21     1.0000
D21 .D91     1.0000    0.1857
D91 .D164    1.0000    2287)

The SAS System output for AA 6400 m
Obs    dev        Alpha
1      144.834    0.78277

The SAS System output for course II6400 m 7 partition
Obs    dev        Alpha
1      536.684    0

The SAS System output for course II6400 96 7 partition
Obs    dev        Alpha
1      588.022    0

The SAS System output for course II6400 97 7 partition
Obs    dev        Alpha
1      591.873    0

The SAS System output for course II6400 98 7 partition
Obs    dev        Alpha
1      471.576    0

The SAS System output for course II6400 99 7 partition
Obs    dev        Alpha
1      363.443    0


The three-interval policy for academic attrition achieves α = 78%. We do not
reject the null hypothesis for Academic Attrition. The 3rd interval is the interesting period
with the highest attrition rate. Turning to Instruction Interruption, the test rejects (α = 0)
for the seven-interval policy and all years. One must use more intervals for Instruction
Interruption in course 6400.


A. CONCLUSIONS


This study has applied the developed method to three categories of deadtime in
course 6400 from 1996 to 1999: the Academic Setback, the Academic Attrition and the
Instruction Interrupt; and concludes with three results. First, the bottlenecks happen
approximately around the 32nd to 45th day for the academic setback. Second, the
bottlenecks happen on the 92nd and 93rd days for the academic attrition with 78% confidence.
Third, the Instruction Interrupt needs more than seven intervals to reach a satisfactory
Chi-square test. It is a candidate for the time homogeneous process.

The developed estimators detect the location of the weaknesses by course and 
category of deadtime for a given data set. Since so many courses are taught and so many 
categories of deadtime exist, it’s not possible to locate all of the possible problems with a 
common model. Finding the location of the weakest point is always the first priority. 

Recall that we assumed the deadtime incidence rate is constant if the course is in a
stable status. The developed estimator calculates rates from best choice intervals. Long
intervals indicate stability; short ones suggest a transient nature. This type of instability
needs further research.
B. RECOMMENDATIONS


This thesis is a pilot study and the developed estimator provides a flexible tool for
the task. The possible future studies include:
• Develop a user-friendly program which executes the same function as this
thesis.
• Analyze the contribution and relationship of the reasons to deadtime in the
concerned interval.
APPENDIXES


APPENDIX A: READ2.GAM 


This program calculates the maximum log likelihood value. The entries printed in bold
should be changed according to the length of the course, the name of the data file and the
number of intervals for the policies.
$TITLE *Gameside's Program for policy Log Maximum Likelihood
$OFFUPPER OFFSYMLIST OFFSYMXREF INLINECOM { }

OPTIONS RESLIM = 900, ITERLIM = 100000
        LIMCOL = 0, LIMROW = 0, DECIMALS = 4, SOLPRINT = OFF
        OPTCR = 0.05
        LP = OSL;   {**OSL has a network solver**}

Set
   i   / s, D1*D164 /
Parameters
   y(i) /
$include as6400_D_98.prn
   /;

Set arc(i,i);
Alias (i,j,k);

Scalar NGrp ;

Parameters
   a(i,j)   optimal a
   c(i,j)   obj value
   b(i) ;

a(i,j)$(ord(j) gt ord(i)) =
     sum(k$(ord(k) gt ord(i) and ord(k) le ord(j)), y(k)) / (ord(j) - ord(i));
c(i,j)$(ord(j) gt ord(i)) = (ord(j) - ord(i))*a(i,j)
   - sum(k$(ord(k) gt ord(i) and ord(k) le ord(j)), y(k))*log(a(i,j))$(a(i,j) gt 0);
arc(i,j) = YES$(ord(j) gt ord(i));

b(i) = 1$(ord(i) eq 1) - 1$(ord(i) eq card(i));
*display c;
*POSITIVE VARIABLE

Binary Variable
   X(i,j)   amount of flows on each arc;

VARIABLE
   TOTCOST  negative log likelihood;

EQUATIONS
   OBJ         define objective function
   FLOWBAL(i)  flow conservation
   numint;

OBJ..
   TOTCOST =E= SUM((i,j)$arc(i,j), c(i,j)*X(i,j));

FLOWBAL(i)..
   SUM(j$arc(i,j), X(i,j)) - SUM(j$arc(j,i), X(j,i)) =E= b(i);

NUMINT..
   sum((i,j)$arc(i,j), X(i,j)) =L= ngrp;

MODEL MCFLOW / ALL /;

Set iter /1*365/;

Parameter report(*,*), sol(i,j);
Scalar old /99999/;

Loop(iter$(ord(iter) ge 3 and ord(iter) le 10),
   ngrp = ord(iter);
   solve mcflow using MIP minimizing TOTCOST;
   report(iter,'Value') = Totcost.l;
   IF (totcost.l lt (old-0.0001),
      sol(i,j) = X.L(i,j);
      old = totcost.l;
   );
);
OpELOn: Yr23021;3 
display report; 
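
The model above is a shortest path from node s to node D164 that uses at most NGrp arcs, where arc (i,j) carries the Poisson negative log likelihood (constants dropped) of treating days i+1 through j as one piece. The following Python sketch illustrates the same idea with dynamic programming in place of a MIP solver. It is not part of the original programs; the function names and the toy counts are assumptions made for illustration only.

```python
import math


def piece_cost(counts, i, j):
    """Cost of arc (i, j): days i+1..j (1-based) form one piece.

    Mirrors c(i,j) = (j - i)*a - sum(y)*log(a), where a is the piece mean
    and the log term is skipped when a is zero.
    """
    total = sum(counts[i:j])
    mean = total / (j - i)
    return (j - i) * mean - (total * math.log(mean) if mean > 0 else 0.0)


def best_partition(counts, max_pieces):
    """Return (cost, breakpoints) minimising total cost with <= max_pieces pieces."""
    n = len(counts)
    inf = float("inf")
    # best[p][j]: cheapest way to cover days 1..j with exactly p pieces
    best = [[inf] * (n + 1) for _ in range(max_pieces + 1)]
    prev = [[-1] * (n + 1) for _ in range(max_pieces + 1)]
    best[0][0] = 0.0
    for p in range(1, max_pieces + 1):
        for j in range(1, n + 1):
            for i in range(j):
                if best[p - 1][i] < inf:
                    cand = best[p - 1][i] + piece_cost(counts, i, j)
                    if cand < best[p][j]:
                        best[p][j] = cand
                        prev[p][j] = i
    p_opt = min(range(1, max_pieces + 1), key=lambda p: best[p][n])
    # Walk back through the predecessors to recover the breakpoints.
    cuts, j = [], n
    for p in range(p_opt, 0, -1):
        cuts.append(j)
        j = prev[p][j]
    return best[p_opt][n], cuts[::-1]


# Toy data (hypothetical): ten training days with a burst of setbacks.
toy = [0, 0, 0, 4, 5, 3, 0, 0, 1, 0]
cost, cuts = best_partition(toy, 3)
print(round(cost, 4), cuts)
```

Unlike the MIP run with OPTCR = 0.05, the recursion returns an exact minimum, so its value can be slightly lower than the solver's for the same data.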
APPENDIX B: READ.GAM


This program selects the best breakpoints of the intervals according to the maximum log
likelihood value. The entries printed in boldface should be changed according to the length
of the course, the name of the data file, and the number of intervals.
$TITLE * * Gamside's Program for Five Pieces Policy * * *
$OFFUPPER OFFSYMLIST OFFSYMXREF INLINECOM { }

OPTIONS RESLIM = 900, ITERLIM = 100000
        LIMCOL = 0, LIMROW = 0, DECIMALS = 4, SOLPRINT = OFF
        OPTCR = 0.05
        LP = OSL;   {**OSL has a network solver**}

Set
   i   / s, D1*D164 /
Parameters
   y(i) /
$include as6400_D_98.prn
   /;

Set arc(i,i);
Alias (i,j,k);

Scalar NGrp ;

Parameters
   a(i,j)   optimal a
   c(i,j)   obj value
   b(i) ;

a(i,j)$(ord(j) gt ord(i)) =
     sum(k$(ord(k) gt ord(i) and ord(k) le ord(j)), y(k)) / (ord(j) - ord(i));
c(i,j)$(ord(j) gt ord(i)) = (ord(j) - ord(i))*a(i,j)
   - sum(k$(ord(k) gt ord(i) and ord(k) le ord(j)), y(k))*log(a(i,j))$(a(i,j) gt 0);
arc(i,j) = YES$(ord(j) ge ord(i));

b(i) = 1$(ord(i) eq 1) - 1$(ord(i) eq card(i));
*display c;
*POSITIVE VARIABLE

Binary Variable
   X(i,j)   amount of flows on each arc;

VARIABLE
   TOTCOST  negative log likelihood;

EQUATIONS
   OBJ         define objective function
   FLOWBAL(i)  flow conservation
   numint;

OBJ..
   TOTCOST =E= SUM((i,j)$arc(i,j), c(i,j)*X(i,j));

FLOWBAL(i)..
   SUM(j$arc(i,j), X(i,j)) - SUM(j$arc(j,i), X(j,i)) =E= b(i);

NUMINT..
   sum((i,j)$arc(i,j), X(i,j)) =L= ngrp;

MODEL MCFLOW / ALL /;

ngrp = 5;
SOLVE MCFLOW USING MIP MINIMIZING TOTCOST;
DISPLAY TOTCOST.L;

parameter output(i,j,*);

output(i,j,'X') = X.L(i,j);
output(i,j,'A')$(X.L(i,j) = 1) =
     sum(k$(ord(k) gt ord(i) and ord(k) le ord(j)), y(k)) / (ord(j) - ord(i));

display output;
APPENDIX C: SAS PROGRAM


1. Data Step 


The Data Step of this program selects the desired data from the raw data. The entries
printed in boldface should be changed according to the deadtime CATEGORY, CDP, and
ENROLLFY of the desired data, the course length, the degrees of freedom, and the breakpoints.


Data Step 


*********************;
**** Select data ****;
*********************;
data data1;
   set diskh.spbsnum;
   if category='AS';
   if cdp='6400';
   if enrollfy=1998;
***********************************************;
**** Count number of setbacks each day     ****;
***********************************************;
proc freq data=data1 noprint;
   table days / out=data2;
   weight count;
**********************************************************;
**** Remove any data for any day greater than the    ****;
**** maximum length of the course.                   ****;
**********************************************************;
data data2;
   set data2;
   if days>164 then delete;
   drop percent;
**********************************************************;
**** Create a data set where each observation is a day***;
**** of the course.                                   ***;
**********************************************************;
data data3;
   do days=1 to 164;
      output;
   end;
**********************************************************;
**** Add in the days with count=0 setbacks.           ****;
**** Now data has an entry for each day of the course.****;
**********************************************************;
data diskh.alldays;
   merge data2 data3;
   by days;
   if count=. then count=0;
**********************************************************;
*Create a list of log factorials,ln(0!) through ln(100!),*;
**********************************************************;
proc print;
run;
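
The Data Step above totals the setbacks per training day, discards days beyond the course length, and fills in a zero for every day on which no setback was recorded. The following Python sketch shows the same bookkeeping under the assumption that the raw records arrive as (day, count) pairs; the function name and the sample records are hypothetical and are not taken from the thesis data.

```python
from collections import Counter


def daily_counts(records, course_length):
    """Return one setback total per course day, with zeros for missing days.

    records: iterable of (day, count) pairs, days numbered from 1.
    """
    totals = Counter()
    for day, count in records:
        if day > course_length:
            continue  # same effect as "if days>164 then delete"
        totals[day] += count  # same effect as PROC FREQ with WEIGHT count
    # The merge with the 1..course_length skeleton supplies the zero days.
    return [totals.get(day, 0) for day in range(1, course_length + 1)]


# Hypothetical records for a six-day course.
sample = [(2, 1), (2, 2), (5, 1), (9, 4)]
filled = daily_counts(sample, 6)
print(filled)
```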


2. Deviance & Alpha Value Step 


The Deviance & Alpha Value Step of this program must be run together with the Data Step
to calculate the deviance and look up the Alpha value from the chi-square distribution.
The entries printed in boldface should be changed according to the course length, the
degrees of freedom, and the breakpoints.


Deviance & Alpha Value Step 


***********************************************************;
**** The incoming data set contains 164 observations.   ***;
**** The variable COUNT for the ith observation is the  ***;
**** total number of setbacks for the ith training day. ***;
***********************************************************;
** The transpose procedure changes the arrangement of the *;
** data set. The 164 observations of COUNT are converted  *;
** into a new data set with one observation               *;
** and 164 variables d1 through d164.                     *;
** The value, for example, of d34 is equal to the value   *;
** of COUNT in the 34th observation.                      *;
***********************************************************;
**** This reconfiguration is done solely because of the   *;
**** style of syntax used by SAS.                         *;
***********************************************************;
proc transpose data=diskh.alldays out=transp prefix=d;
   var count;
***********************************************************;
**** The variables are added: n (total days in course),   *;
**** df (degrees of freedom of the chi-square), and       *;
**** p1 p2 p3 p4 p5 (number of days in each piece of the  *;
**** partition).                                          *;
***********************************************************;
data allpart;
   set transp;
   drop _name_;
   n  = 164;
   df = 159;
   b1 = 14;
   b2 = 16;
   b3 = 77;
   b4 = 105;
   b5 = 164;
   p1 = b1;
   p2 = b2-b1;
   p3 = b3-b2;
   p4 = b4-b3;
   p5 = b5-b4;
***********************************************************;
** The ultimate goal in this data step is calculation of  *;
** the "deviance" for the partition specified by p1       *;
** through p5.                                            *;
***********************************************************;
data allpart;
   set allpart;
   array ff(i) d1-d164;
   array mm(i) m1-m164;
***********************************************;
**** Calculate average setbacks of piece 1.  *;
***********************************************;
   sumx=0;
   do i = 1 to p1;
      sumx=sumx+ff;
   end;
   meanx=sumx/p1;
   do i = 1 to p1;
      mm=meanx;
   end;
***********************************************;
**** Calculate average setbacks of piece 2.  *;
***********************************************;
   sumx=0;
   do i = p1+1 to p1+p2;
      sumx=sumx+ff;
   end;
   meanx=sumx/p2;
   do i = p1+1 to p1+p2;
      mm=meanx;
   end;
***********************************************;
**** Calculate average setbacks of piece 3.  *;
***********************************************;
   sumx=0;
   do i = p1+p2+1 to p1+p2+p3;
      sumx=sumx+ff;
   end;
   meanx=sumx/p3;
   do i = p1+p2+1 to p1+p2+p3;
      mm=meanx;
   end;
***********************************************;
**** Calculate average setbacks of piece 4.  *;
***********************************************;
   sumx=0;
   do i = p1+p2+p3+1 to p1+p2+p3+p4;
      sumx=sumx+ff;
   end;
   meanx=sumx/p4;
   do i = p1+p2+p3+1 to p1+p2+p3+p4;
      mm=meanx;
   end;
***********************************************;
**** Calculate average setbacks of piece 5.  *;
***********************************************;
   sumx=0;
   do i = p1+p2+p3+p4+1 to n;
      sumx=sumx+ff;
   end;
   meanx=sumx/p5;
   do i = p1+p2+p3+p4+1 to n;
      mm=meanx;
   end;
***********************************************;
**** Calculate deviance which is -2 times    *;
**** the log of the ratio of the likelihood  *;
**** of the hypothesized model and the       *;
**** likelihood of the saturated model.      *;
***********************************************;
   dev=0;
   do i = 1 to 164;
      if ff LT 1.e-15 then dev = dev - (ff-mm);
      else dev = dev + ff*log(ff/mm);
   end;
   dev = 2*dev;
***********************************************************;
**** Calculate the cumulative probability for the        *;
**** Chi-square distribution from 0 to dev               *;
**** for df degrees of freedom.                          *;
***********************************************************;
   alpha=1-probchi(dev,df);
   drop m1-m164 d1-d164;
***********************************************************;
**** Print the deviance and the Alpha.                   *;
***********************************************************;
proc print data=allpart;
   var dev alpha;
   title "Five piece partition, category=AS cdp=6400 enrollfy=1998";
run;
APPENDIX D: TEST RESULTS


AS6400_mix_7P

VARIABLE TOTCOST.L = 90.6703 negative log likelihood
PARAMETER output

                    X
S   .D13       1.0000
D13 .D37       1.0000
D37 .D40       1.0000
D40 .D125      1.0000
D125.D132      1.0000
D132.D133      1.0000
D133.D164      1.0000

The SAS System Output of AS6400 mix for 7 Partition Dev & Alpha value

Obs    dev         Alpha
1      341.2205    0


AS6400_96_7P

VARIABLE TOTCOST.L = 80.4718 negative log likelihood
PARAMETER output

                    X
S   .D35       1.0000
D35 .D42       1.0000
D42 .D62       1.0000
D62 .D63       1.0000
D63 .D122      1.0000
D122.D125      1.0000
D125.D164      1.0000

The SAS System Output of AS6400_96_7P for 7 Partition Dev & Alpha value

Obs    dev         Alpha
1      Laoslos.    0.9267


AS6400_97_7P

VARIABLE TOTCOST.L = 100.6404 negative log likelihood
PARAMETER output

                    X
S   .D13       1.0000
D13 .D88       1.0000
D88 .D119      1.0000
D119.D125      1.0000
D125.D132      1.0000
D132.D133      1.0000
D133.D164      1.0000

The SAS System Output of AS6400_97_7P for 7 Partition Dev & Alpha value

Obs    dev         Alpha
1      249.702     0
APPENDIX E: GUIDANCE for USING PROGRAMS


1. Preparing Data: 


a. Selecting and organizing the desired data (AS6400_98, Academic Setbacks of
course 6400 in 1998) with the SAS program Data Step (Appendix C) to produce
the following two-dimensional output.


The SAS System Data Step Output from AS6400_98

DAY  COUNT    DAY  COUNT    DAY  COUNT    DAY  COUNT    DAY  COUNT
  1      0     34      0     67      0    100      3    133      2
  2      0     35      0     68      0    101      0    134      0
  3      0     36      0     69      1    102      0    135      0
  4      0     37      0     70      0    103      1    136      0
  5      0     38      0     71      0    104      1    137      0
  6      0     39      0     72      0    105      1    138      1
  7      0     40      0     73      0    106      0    139      0
  8      0     41      0     74      0    107      0    140      0
  9      0     42      1     75      0    108      0    141      0
 10      0     43      0     76      0    109      0    142      0
 11      0     44      0     77      0    110      0    143      0
 12      0     45      1     78      2    111      0    144      0
 13      0     46      0     79      0    112      0    145      0
 14      0     47      0     80      0    113      0    146      1
 15      4     48      0     81      0    114      0    147      0
 16      1     49      0     82      2    115      0    148      0
 17      0     50      0     83      0    116      0    149      0
 18      0     51      0     84      0    117      0    150      0
 19      0     52      0     85      1    118      2    151      0
 20      0     53      1     86      2    119      0    152      0
 21      0     54      0     87      0    120      0    153      0
 22      0     55      1     88      1    121      0    154      0
 23      0     56      0     89      3    122      0    155      0
 24      0     57      0     90      0    123      0    156      0
 25      0     58      0     91      0    124      0    157      0
 26      0     59      0     92      1    125      0    158      0
 27      0     60      0     93      2    126      0    159      0
 28      0     61      1     94      0    127      0    160      0
 29      0     62      0     95      0    128      0    161      0
 30      0     63      0     96      1    129      0    162      0
 31      0     64      0     97      1    130      0    163      0
 32      0     65      1     98      0    131      0    164      0
 33      2     66      0     99      0    132      0


b. Reorganizing the previous SAS Data output and saving in Formatted Text 
(Space delimited) format for programs read2.gam and read.gam. 


The input data of AS6400_D_98.prn for both read2.gam and read.gam

D1     0    D34    0    D67    0    D100   3    D133   2
D2     0    D35    0    D68    0    D101   0    D134   0
D3     0    D36    0    D69    1    D102   0    D135   0
D4     0    D37    0    D70    0    D103   1    D136   0
D5     0    D38    0    D71    0    D104   1    D137   0
D6     0    D39    0    D72    0    D105   1    D138   1
D7     0    D40    0    D73    0    D106   0    D139   0
D8     0    D41    0    D74    0    D107   0    D140   0
D9     0    D42    1    D75    0    D108   0    D141   0
D10    0    D43    0    D76    0    D109   0    D142   0
D11    0    D44    0    D77    0    D110   0    D143   0
D12    0    D45    1    D78    2    D111   0    D144   0
D13    0    D46    0    D79    0    D112   0    D145   0
D14    0    D47    0    D80    0    D113   0    D146   1
D15    4    D48    0    D81    0    D114   0    D147   0
D16    1    D49    0    D82    2    D115   0    D148   0
D17    0    D50    0    D83    0    D116   0    D149   0
D18    0    D51    0    D84    0    D117   0    D150   0
D19    0    D52    0    D85    1    D118   2    D151   0
D20    0    D53    1    D86    2    D119   0    D152   0
D21    0    D54    0    D87    0    D120   0    D153   0
D22    0    D55    1    D88    1    D121   0    D154   0
D23    0    D56    0    D89    3    D122   0    D155   0
D24    0    D57    0    D90    0    D123   0    D156   0
D25    0    D58    0    D91    0    D124   0    D157   0
D26    0    D59    0    D92    1    D125   0    D158   0
D27    0    D60    0    D93    2    D126   0    D159   0
D28    0    D61    1    D94    0    D127   0    D160   0
D29    0    D62    0    D95    0    D128   0    D161   0
D30    0    D63    0    D96    1    D129   0    D162   0
D31    0    D64    0    D97    1    D130   0    D163   0
D32    0    D65    1    D98    0    D131   0    D164   0
D33    2    D66    0    D99    0    D132   0
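
The reorganization in step b can also be scripted. The following Python sketch arranges a list of daily counts into the five-column, space-delimited layout shown above, with each entry written as a day label followed by its count. It is only an illustration of the layout; the function name, the column spacing, and the sample counts are assumptions, and the thesis does not state how the file was actually produced.

```python
def prn_lines(counts, columns=5):
    """Lay out 'D<day> <count>' pairs column by column, as in the listing above."""
    n = len(counts)
    rows = -(-n // columns)  # ceiling division: 164 days -> 33 rows
    lines = []
    for r in range(rows):
        cells = []
        for c in range(columns):
            day = c * rows + r + 1  # column c starts at day c*rows + 1
            if day <= n:
                cells.append(f"D{day} {counts[day - 1]}")
        lines.append("   ".join(cells))
    return lines


# Illustrative counts: 164 days, with four setbacks placed on day 15.
counts = [0] * 164
counts[14] = 4
lines = prn_lines(counts)
print("\n".join(lines[:3]))
```

Writing `"\n".join(lines)` to a text file would give a file in the same arrangement as the listing.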


2. Calculating Maximum Negative Log Likelihood 


The program read2.gam (Appendix A) uses the formatted data AS6400_D_98.prn as input
to calculate the minimized negative log likelihood (equivalently, the maximized log
likelihood) for policies with different numbers of intervals. The following is an
example output.

The output of read2.gam for as6400_D_98.prn


         Value
3      84.8867
4      82.2978
5      73.6616
6      70.9237
7      67.9204
8      66.5098
9      62.5889
10     So ee aol


3. Selecting Best Choice of Partition 


The program read.gam (Appendix B) uses the formatted data AS6400_D_98.prn as input
to select the best combination of breakpoints for the policy. The following is an
example output.


The output of read.gam for as6400_D_98.prn

VARIABLE TOTCOST.L = 73.6616 negative log likelihood
PARAMETER output

                    X        A
S   .D14       1.0000
D14 .D16       1.0000   2.5000
D16 .D77       1.0000   0.1475
D77 .D105      1.0000   0.7857
D105.D164      1.0000   0.1017
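
The reported objective value can be checked by hand from the daily counts. The following Python sketch recomputes the per-piece averages (column A) and the negative log likelihood for the breakpoints D14, D16, D77, D105, and D164, using the arc cost defined in read.gam. The dictionary of non-zero days is transcribed from the data table in step 1; the variable and function names are illustrative assumptions.

```python
import math

# Days with at least one setback, transcribed from the AS6400_98 table (day: count).
NONZERO = {
    15: 4, 16: 1, 33: 2, 42: 1, 45: 1, 53: 1, 55: 1, 61: 1, 65: 1, 69: 1,
    78: 2, 82: 2, 85: 1, 86: 2, 88: 1, 89: 3, 92: 1, 93: 2, 96: 1, 97: 1,
    100: 3, 103: 1, 104: 1, 105: 1, 118: 2, 133: 2, 138: 1, 146: 1,
}
counts = [NONZERO.get(day, 0) for day in range(1, 165)]
breaks = [14, 16, 77, 105, 164]  # last day of each piece


def piece_stats(counts, lo, hi):
    """Mean and cost of the piece covering days lo+1..hi (1-based)."""
    total = sum(counts[lo:hi])
    mean = total / (hi - lo)
    cost = (hi - lo) * mean - (total * math.log(mean) if mean > 0 else 0.0)
    return mean, cost


means, total_cost, lo = [], 0.0, 0
for hi in breaks:
    mean, cost = piece_stats(counts, lo, hi)
    means.append(mean)
    total_cost += cost
    lo = hi

print(round(total_cost, 4), [round(m, 4) for m in means])
```

The first piece has an average of zero, which GAMS displays as a blank in column A.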


4. Calculating Deviance & Alpha Value 


The Deviance & Alpha Value Step of the SAS program (Appendix C) uses the first column
of the read.gam output, which gives the indices of the best breakpoints, as input to
calculate the deviance and look up the Alpha value. The following is an example output.


The SAS System Output of AS6400_98
for 5 Partition Deviance & Alpha Value


Obs    dev        Alpha
1      155.831    0.55624
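
The deviance loop of Appendix C can be traced outside SAS. The following Python sketch mirrors that loop as listed: each day is compared with the average of its piece, and for days with no setbacks, where log(f/m) is undefined, the fitted average is added instead. The counts are transcribed from the data table in step 1 and the breakpoints from the read.gam output; the function and variable names are assumptions for illustration.

```python
import math

# Days with at least one setback, transcribed from the AS6400_98 table (day: count).
NONZERO = {
    15: 4, 16: 1, 33: 2, 42: 1, 45: 1, 53: 1, 55: 1, 61: 1, 65: 1, 69: 1,
    78: 2, 82: 2, 85: 1, 86: 2, 88: 1, 89: 3, 92: 1, 93: 2, 96: 1, 97: 1,
    100: 3, 103: 1, 104: 1, 105: 1, 118: 2, 133: 2, 138: 1, 146: 1,
}
counts = [NONZERO.get(day, 0) for day in range(1, 165)]
breaks = [14, 16, 77, 105, 164]  # b1..b5 in the SAS listing


def listed_deviance(counts, breaks):
    """Follow the SAS loop: piece means first, then the day-by-day sum."""
    dev, lo = 0.0, 0
    for hi in breaks:
        piece = counts[lo:hi]
        mean = sum(piece) / len(piece)  # meanx for this piece
        for f in piece:
            if f < 1e-15:
                dev -= f - mean          # zero-count day: adds the piece mean
            else:
                dev += f * math.log(f / mean)
        lo = hi
    return 2 * dev


dev = listed_deviance(counts, breaks)
df = len(counts) - len(breaks)  # 164 - 5 = 159, as in the listing
print(round(dev, 3), df)
```

The Alpha value additionally requires the upper tail of the chi-square distribution with df degrees of freedom, which SAS obtains as 1-probchi(dev,df); in Python this could be taken from a statistics library, for example `scipy.stats.chi2.sf(dev, df)` if SciPy is available.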